Overall Description of Lab 9

Due: 5:00PM Friday, March 15, 2019 on OSF

  1. You will individually:
    • join datasets
    • work with and manipulate strings
    • work with and manipulate factors
    • work with and manipulate dates and times
    • specify three underlying questions you are answering
    • generate interesting findings that answer those questions
    • create at least one plot that answers an interesting question
    • write a few sentences in the team report detailing what you did, including which functions from the tidyverse you used.
  2. As a team, you will produce a lab report in R Markdown. The lab report will include:
    • a series of “findings” and plots based on individuals’ work
    • an overall “most interesting” plot that answers a question
    • a section detailing which team members did what.

Overall Goal

Your team has been hired by NiceRide, Minneapolis’s bicycle-sharing service, to optimize their business. Across the city, they have docking stations that each hold anywhere from 8 to 30 bicycles at a time. You can pick up a bicycle at one station, ride it somewhere, and leave it at another station. Every year, they log the data and post it on their website. You can get the data for 2017 here. Unzip the data and you will see three files: two datasets and one readme file.

The first dataset, Nice_Ride_2017_Station_Locations, contains information on the docking stations. There are about 200 stations, and for each one the dataset reports the number of docks, the latitude, and the longitude. The second dataset, Nice_ride_trip_history_2017_season, contains a log of all rides in 2017. Check out the readme file for a description of this dataset.

You will also need the following libraries. Lubridate helps us work with dates and times. OpenStreetMap is a package that allows us to plot maps. I have included the CRAN page for lubridate and a tutorial on using OpenStreetMap for your reference:

Disclaimer: OpenStreetMap only works on Windows machines. We were unable to find an easy fix for getting it to work on Mac computers. Even if the maps don’t work, you can still plot latitudes and longitudes using the sample code below. If at least one member of your group has a PC, just have them compile the document with the map overlay. If nobody in your group has a PC, try to borrow one and compile the document there. As a last resort, and a last resort only, you can email me the document and I will compile it for you.

# Load libraries
library(tidyverse)
library(lubridate)
library(OpenStreetMap)

Plotting

For this dataset, it would be really cool if we could make plots with an overlaid map. To do this, you will need to join the two datasets so that we can use the latitudes and longitudes of the stations for plotting. Note that we need the latitude and longitude for both the starting and ending stations, which requires two different joins. Once you have done this, we can make really interesting plots with this data. Here are a couple of examples.
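Before the plotting examples, here is a minimal sketch of the two joins described above, run on a tiny made-up dataset. The column names (Number, Start.station.number, End.station.number) and the station numbers are assumptions for illustration only, so check the readme and names() of your own data before adapting it. The output columns are named start.lat, start.long, end.lat, and end.long to match the plotting code later in this document.

```r
library(dplyr)

# Toy stand-ins for the two files. Column names and values are
# assumptions for illustration; the real files may differ.
locations <- tibble(
  Number    = c("30000", "30001", "30002"),
  Latitude  = c(44.98, 44.95, 44.97),
  Longitude = c(-93.27, -93.29, -93.25)
)

trips <- tibble(
  Start.station.number = c("30000", "30001", "30002"),
  End.station.number   = c("30001", "30002", "30000")
)

# Rename the coordinate columns before each join so the start and end
# coordinates do not collide.
start_coords <- locations %>%
  select(Number, start.lat = Latitude, start.long = Longitude)
end_coords <- locations %>%
  select(Number, end.lat = Latitude, end.long = Longitude)

# First join: coordinates of the starting station.
# Second join: coordinates of the ending station.
data <- trips %>%
  left_join(start_coords, by = c("Start.station.number" = "Number")) %>%
  left_join(end_coords, by = c("End.station.number" = "Number"))
```

A left join keeps every ride even if its station is missing from the locations table, in which case the coordinates come back as NA. That makes unmatched stations easy to spot.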

Plot map of Minneapolis

# Set latitudes and longitudes of city map
LAT1 <- 44.88     # Do not change
LAT2 <- 45.05     # Do not change
LON1 <- -93.35    # Do not change
LON2 <- -93.08    # Do not change

# Generate map
map <- openmap(c(LAT2,LON1), c(LAT1,LON2), zoom = NULL, # Can change zoom
               type = "esri",                           # Can change
               mergeTiles = TRUE)                       # Do not change

# Project map to latitude and longitude
map.latlon <- openproj(map, projection = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs") # Do not change

# Plot map of Minneapolis
autoplot(map.latlon) # Plots a map 

Plot the stations as points

# Plot all NiceRide stations
# If you are not plotting the map, use ggplot() instead of autoplot
#ggplot() +
autoplot(map.latlon) + # Plots a map 
  geom_point(data=locations, 
             aes(x=Longitude, y=Latitude), 
             color = 'blue', size = 1) +
  labs(x='Longitude', y='Latitude') +
  ggtitle('Locations of NiceRide Stations')

Plot rides for Halloween morning

# Filter data for Halloween morning
hallo <- data %>%
  filter(Startingdate == "2017-10-31") %>%
  filter(Startinghour < 8)
# Note that I used lubridate to make the columns Startingdate and Startinghour

# Plot rides for Halloween morning
#ggplot() +
autoplot(map.latlon) + # Plots a map 
  geom_segment(data=hallo, 
               aes(x=start.long, y=start.lat,
                   xend=end.long, yend=end.lat), 
               color = 'blue', linewidth = .5, # linewidth replaces size for lines in ggplot2 >= 3.4.0
               arrow = arrow(length = unit(0.2, "cm"))) +
  labs(x='Longitude', y='Latitude') +
  scale_x_continuous(limits = c(-93.3, -93.2)) +
  scale_y_continuous(limits = c(44.93, 45.02)) +
  ggtitle('Halloween Morning Rides')
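The Startingdate and Startinghour columns used in the Halloween filter are not in the raw data. One way to build them with lubridate is sketched below. The column name Start.date and the month/day/year hour:minute text format are assumptions for illustration, so check how the dates look in your own file and pick the matching parser (for example ymd_hms() instead of mdy_hm()).

```r
library(dplyr)
library(lubridate)

# Toy ride log. The column name and date format are assumptions.
trips <- tibble(
  Start.date = c("10/31/2017 7:15", "10/31/2017 17:40", "7/4/2017 9:05")
)

data <- trips %>%
  mutate(
    Startingtime = mdy_hm(Start.date),     # parse text into a date-time (UTC by default)
    Startingdate = as_date(Startingtime),  # calendar date only
    Startinghour = hour(Startingtime)      # hour of day, 0 to 23
  )

# Same filter as above, written as a single filter() call
hallo <- data %>%
  filter(Startingdate == ymd("2017-10-31"), Startinghour < 8)
```

On this toy data, only the 7:15 ride on October 31 survives the filter.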

Don’t forget about other kinds of plots, such as histograms, boxplots, and scatter plots. These can also be very informative.
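As a rough sketch of a plot that needs no map, here is a histogram of rides by starting hour. The rides data frame below is made up for illustration. With your own data you would use a column like Startinghour built with lubridate.

```r
library(ggplot2)

# Made-up starting hours for ten rides (illustration only)
rides <- data.frame(Startinghour = c(7, 8, 8, 9, 12, 17, 17, 17, 18, 21))

# One bar per hour of the day
p <- ggplot(rides, aes(x = Startinghour)) +
  geom_histogram(binwidth = 1, fill = 'blue', color = 'white') +
  labs(x = 'Hour of day', y = 'Number of rides') +
  ggtitle('Rides by Starting Hour')
p
```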

Interesting Questions

There are a ton of different questions you can answer here. Each group member should have at least one unique question to answer. Some good examples are:

Come up with your own interesting questions! There are a lot of different things you can investigate!